Abstract

We introduce the concept of stochastic and Markov processes and the Chapman-Kolmogorov equations that define the latter. We show how Markov processes can be described in terms of the Markov propagator density function and the related propagator moment functions. We introduce the Kramers-Moyal equations and use them to discuss the evolution of the moments. We introduce two simple example processes that illustrate how Markov processes can be defined and characterized in practical terms. Finally, we introduce homogeneous Markov processes and show how the apparatus developed so far simplifies for this class. This module does not discuss more specific Markov processes (continuous, jump, birth-death, etc.).

1 Preliminaries 

1. "Probability Distributions," Connexions module m43336 

2 Stochastic and Markov Processes 

A stochastic process is a process that evolves probabilistically through various states attained at various times from a well defined initial state $\mathbf{x}_0$ at time $t_0$. The probability density function of the process depends on the states and the times at which they are reached, for example,

$$P_{n|1}\left((\mathbf{x}_n,t_n),(\mathbf{x}_{n-1},t_{n-1}),\ldots,(\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right) \tag{1}$$

is the probability that the process will reach $\mathbf{x}_1$ at time $t_1$, then will progress to $\mathbf{x}_2$ at $t_2$, then will go through all the subsequent configurations listed, to reach $\mathbf{x}_n$ at time $t_n$, given that it has started from $\mathbf{x}_0$ at time $t_0$. Whether the process evolves between the points continuously or in jumps we do not ask (or care) at this stage. But we will develop the means to specify this, and it will let us say a lot about such processes, which is quite surprising given how general they seem to be at first sight.

A stochastic process can be further characterized by other conditional probability densities such as

$$\begin{aligned}
&P_{n-1|2}\left((\mathbf{x}_n,t_n),(\mathbf{x}_{n-1},t_{n-1}),\ldots,(\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right),\\
&P_{n-2|3}\left((\mathbf{x}_n,t_n),(\mathbf{x}_{n-1},t_{n-1}),\ldots,(\mathbf{x}_3,t_3) \mid (\mathbf{x}_2,t_2),(\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right)
\end{aligned} \tag{2}$$

and so on. What is to the right of $\mid$ are "spacetime" points at which the system has been. What is to the left are "spacetime" points which the system may visit, and it is the probability of the system doing so that the function describes.
Let us consider the function

$$P_{1|j}\left((\mathbf{x}_j,t_j) \mid (\mathbf{x}_{j-1},t_{j-1}),\ldots,(\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right). \tag{3}$$

Here we state that the probability of the system reaching $\mathbf{x}_j$ at time $t_j$ is a function of where the system has been so far, that is, it depends on the system's entire history. We say that a stochastic process is Markovian if this is not the case, that is, if the probability of the system reaching $\mathbf{x}_j$ at $t_j$ depends only on where it has been at $t_{j-1}$, but not on the previous states. A Markov process is a process that remembers only the last state reached.

We express it symbolically as follows

$$P_{1|j}\left((\mathbf{x}_j,t_j) \mid (\mathbf{x}_{j-1},t_{j-1}),\ldots,(\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right) = P\left((\mathbf{x}_j,t_j) \mid (\mathbf{x}_{j-1},t_{j-1})\right). \tag{4}$$

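This assumption simplifies the description of the corresponding stochastic processes to the point of making them tractable, which is why we are so interested in them. Let us consider $P_{2|1}\left((\mathbf{x}_2,t_2),(\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right)$. Clearly, this is equal to the probability of the system reaching $\mathbf{x}_1$ at $t_1$ times the probability of reaching $\mathbf{x}_2$ at $t_2$ given that it has been at $\mathbf{x}_1$ at $t_1$ and at $\mathbf{x}_0$ at $t_0$, that is

$$P_{2|1}\left((\mathbf{x}_2,t_2),(\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right) = P\left((\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right) P_{1|2}\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right). \tag{5}$$

But if this is to be a Markov process then

$$P_{1|2}\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1),(\mathbf{x}_0,t_0)\right) = P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right). \tag{6}$$

Consequently

$$P_{2|1}\left((\mathbf{x}_2,t_2),(\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right) = P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right) P\left((\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right). \tag{7}$$

This extends naturally to an arbitrary number of transitions, so that

$$P_{n|1}\left((\mathbf{x}_n,t_n),\ldots,(\mathbf{x}_1,t_1) \mid (\mathbf{x}_0,t_0)\right) = \prod_{i=1}^{n} P\left((\mathbf{x}_i,t_i) \mid (\mathbf{x}_{i-1},t_{i-1})\right). \tag{8}$$

The magic function, $P\left((\mathbf{x}_i,t_i) \mid (\mathbf{x}_{i-1},t_{i-1})\right)$, is called the Markov state density function.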
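The factorization (8) can be sketched in a few lines of code. The sketch below is only an illustration: the Gaussian transition density `gaussian_step` and its `diffusion` parameter are hypothetical choices made here, not something the text prescribes. Any single-step density could be passed in its place.

```python
import math

def gaussian_step(x_new, t_new, x_old, t_old, diffusion=1.0):
    """Hypothetical single-step density P((x_new,t_new)|(x_old,t_old)).

    Assumed here to be Gaussian, centred on x_old, with variance
    diffusion * (t_new - t_old). This is an illustrative choice only.
    """
    var = diffusion * (t_new - t_old)
    return math.exp(-(x_new - x_old) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def path_density(points, step=gaussian_step):
    """Joint density of visiting points[1:], given points[0], as the product in (8)."""
    density = 1.0
    for (x_old, t_old), (x_new, t_new) in zip(points, points[1:]):
        density *= step(x_new, t_new, x_old, t_old)
    return density

# A path (x_0,t_0) -> (x_1,t_1) -> (x_2,t_2) -> (x_3,t_3) in one dimension
path = [(0.0, 0.0), (0.5, 1.0), (0.2, 2.0), (1.0, 3.5)]
joint = path_density(path)
```

Each factor looks only at the immediately preceding point, which is exactly the Markov property expressed by (4).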

It is difficult not to notice here a similarity to quantum mechanical processes. If $\mathbf{x}_i$ were a quantum particle position attained at time $t_i$, then the probability amplitude (a complex number in general) of the particle progressing from $\mathbf{x}_0$ at $t_0$ through $\mathbf{x}_1$ at $t_1$, $\mathbf{x}_2$ at $t_2$ and so on, until reaching $\mathbf{x}_n$ at $t_n$, along this specific path, would be calculated similarly as

$$\prod_{i=1}^{n} \langle (\mathbf{x}_i,t_i) \mid (\mathbf{x}_{i-1},t_{i-1}) \rangle. \tag{9}$$

The full probability amplitude of the particle starting from $\mathbf{x}_0$ at $t_0$ and reaching $\mathbf{x}_n$ at $t_n$ would then be a sum of such products evaluated for all possible paths that the particle could take:

$$\langle (\mathbf{x}_n,t_n) \mid (\mathbf{x}_0,t_0) \rangle = \sum_{\text{paths}} \left( \prod_{i=1}^{n} \langle (\mathbf{x}_i,t_i) \mid (\mathbf{x}_{i-1},t_{i-1}) \rangle \right). \tag{10}$$

The probability itself would be evaluated by taking the square of the absolute value of $\langle (\mathbf{x}_n,t_n) \mid (\mathbf{x}_0,t_0) \rangle$. But for a single, specific path (8) would apply, because the square of the absolute value of the amplitude in this case would be a product of the squares of the single step amplitudes, as listed by (9). We can therefore think of a progression of a quantum particle along a certain specific path as a typical Markovian process. We will also find, eventually, that this is how Brownian motion is described, another classic example of a Markov process. There is an intriguing bridge between Brownian motion and quantum mechanics, pointed to in 1966 by Edward Nelson, a professor of Mathematics at Princeton University at the time. This is a topic we intend to explore in further modules.

3 The Chapman-Kolmogorov Equation 

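The Markov state density function $P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right)$ must satisfy certain obvious properties, namely

$$P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right) \ge 0 \tag{11}$$

and

$$\int_{\Omega(\mathbf{x}_2)} P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right) d\mathbf{x}_2 = 1, \tag{12}$$

where $\Omega(\mathbf{x}_2)$ is the domain of $\mathbf{x}_2$. Both follow from $P$ being a probability density. The fact that $P$ relates to the Markov process is reflected in the following property

$$\begin{aligned}
P_{1|1}\left((\mathbf{x}_3,t_3) \mid (\mathbf{x}_1,t_1)\right) &= \int_{\Omega(\mathbf{x}_2)} P_{2|1}\left((\mathbf{x}_3,t_3),(\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right) d\mathbf{x}_2\\
&= \int_{\Omega(\mathbf{x}_2)} P\left((\mathbf{x}_3,t_3) \mid (\mathbf{x}_2,t_2)\right) P\left((\mathbf{x}_2,t_2) \mid (\mathbf{x}_1,t_1)\right) d\mathbf{x}_2.
\end{aligned} \tag{13}$$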

This is the celebrated Chapman-Kolmogorov equation. 
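A numerical check can make the Chapman-Kolmogorov equation more tangible. The sketch below assumes, purely for illustration, a one-dimensional Gaussian transition density whose variance grows linearly with elapsed time (the function `transition` and its `diffusion` parameter are hypothetical). It then compares the direct density with the integral over the intermediate state, evaluated by the trapezoidal rule.

```python
import math

def gauss(x, mean, var):
    """Normal density with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def transition(x_to, t_to, x_from, t_from, diffusion=1.0):
    """Hypothetical P((x_to,t_to)|(x_from,t_from)); Gaussian, variance diffusion*(t_to-t_from)."""
    return gauss(x_to, x_from, diffusion * (t_to - t_from))

def chapman_kolmogorov_rhs(x3, t3, x1, t1, t2, lo=-12.0, hi=12.0, n=2400):
    """Trapezoidal estimate of the integral over the intermediate state x2."""
    h = (hi - lo) / n
    total = 0.0
    for i in range(n + 1):
        x2 = lo + i * h
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * transition(x3, t3, x2, t2) * transition(x2, t2, x1, t1)
    return total * h

lhs = transition(0.7, 2.0, 0.0, 0.0)                        # direct density from t1 to t3
rhs = chapman_kolmogorov_rhs(0.7, 2.0, 0.0, 0.0, t2=0.5)    # via intermediate time t2
```

For this particular density the two numbers agree to within the quadrature error, and the agreement does not depend on where the intermediate time `t2` is placed between `t1` and `t3`.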

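We are going to rewrite the equation in two ways by making the following substitutions.

forward:

$$\mathbf{x}_1 \to \mathbf{x}_0,\quad t_1 \to t_0,\quad \mathbf{x}_2 \to \mathbf{x}-\mathbf{x}',\quad t_2 \to t,\quad \mathbf{x}_3 \to \mathbf{x},\quad t_3 \to t+\Delta t, \tag{14}$$

which yields the forward Kolmogorov equation:

$$P\left((\mathbf{x},t+\Delta t) \mid (\mathbf{x}_0,t_0)\right) = \int_{\Omega(\mathbf{x}')} P\left((\mathbf{x},t+\Delta t) \mid (\mathbf{x}-\mathbf{x}',t)\right) P\left((\mathbf{x}-\mathbf{x}',t) \mid (\mathbf{x}_0,t_0)\right) d\mathbf{x}'. \tag{15}$$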



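backward:

$$\mathbf{x}_1 \to \mathbf{x}_0,\quad t_1 \to t_0,\quad \mathbf{x}_2 \to \mathbf{x}_0+\mathbf{x}',\quad t_2 \to t_0+\Delta t,\quad \mathbf{x}_3 \to \mathbf{x},\quad t_3 \to t, \tag{16}$$

which yields the backward Kolmogorov equation:

$$P\left((\mathbf{x},t) \mid (\mathbf{x}_0,t_0)\right) = \int_{\Omega(\mathbf{x}')} P\left((\mathbf{x},t) \mid (\mathbf{x}_0+\mathbf{x}',t_0+\Delta t)\right) P\left((\mathbf{x}_0+\mathbf{x}',t_0+\Delta t) \mid (\mathbf{x}_0,t_0)\right) d\mathbf{x}'. \tag{17}$$

Of course, we always assume that $t_1 < t_2 < t_3$ and that the same ordering holds for the substitutions.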

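Nothing stops us from inserting more intermediate points into the progression of the observed system through the Kolmogorov steps, which leads to the compounded Chapman-Kolmogorov equation

$$P\left((\mathbf{x}_n,t_n) \mid (\mathbf{x}_0,t_0)\right) = \int_{\Omega(\mathbf{x}_{n-1})} \cdots \int_{\Omega(\mathbf{x}_1)} \prod_{i=1}^{n} P\left((\mathbf{x}_i,t_i) \mid (\mathbf{x}_{i-1},t_{i-1})\right) d\mathbf{x}_1 \ldots d\mathbf{x}_{n-1}. \tag{18}$$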

4 Moments of Markov State Density Function 

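The moments of the Markov state density function are computed as for any other probability density, namely

$$\langle x^n \rangle = \int_{\Omega(x)} x^n P\left((x,t) \mid (x_0,t_0)\right) dx, \tag{19}$$

and similarly for the variance and the standard deviation:

$$\mathrm{var}(x) = \sigma^2(x) = \langle (x - \langle x \rangle)^2 \rangle. \tag{20}$$

We can also compute the first term in the covariance of the last two positions in the Markov chain:

$$\langle x_2 x_1 \rangle = \int_{\Omega(x_2)} \int_{\Omega(x_1)} x_1 x_2\, P\left((x_2,t_2) \mid (x_1,t_1)\right) P\left((x_1,t_1) \mid (x_0,t_0)\right) dx_1\, dx_2. \tag{21}$$

Before we go any further, let us first brush up on some properties of the moments, variances, covariances and standard deviations:

$$\begin{aligned}
\langle x - \langle x \rangle \rangle &= 0,\\
\mathrm{var}(x) = \sigma^2(x) &= \langle (x - \langle x \rangle)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2,\\
\langle x^2 \rangle &\ge \langle x \rangle^2,
\end{aligned} \tag{22}$$

where the equality holds for a sure variable only. Some elementary properties of covariances and correlations are:

$$\begin{aligned}
\mathrm{cov}(x,y) &= \langle (x - \langle x \rangle)(y - \langle y \rangle) \rangle = \langle xy \rangle - \langle x \rangle \langle y \rangle,\\
\sigma(x)\,\sigma(y) &\ge |\mathrm{cov}(x,y)|,\\
\mathrm{corr}(x,y) &= \frac{\mathrm{cov}(x,y)}{\sigma(x)\,\sigma(y)},\\
1 &\ge |\mathrm{corr}(x,y)|.
\end{aligned} \tag{23}$$

Also, we observe that statistically independent variables $x$ and $y$, that is, variables such that $P_{xy}(x,y) = P_x(x)\,P_y(y)$, are uncorrelated, that is, $\mathrm{cov}(x,y) = 0$. But uncorrelated variables do not have to be statistically independent.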
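The last remark is easy to confirm with a small example. The sketch below uses a discrete distribution rather than a density, which is an assumption made only to keep the arithmetic exact: $x$ takes the values $-1$, $0$, $1$ with equal probability and $y = x^2$. The covariance vanishes although $y$ is completely determined by $x$.

```python
from fractions import Fraction

# Joint distribution of (x, y) with y = x**2 and x uniform on {-1, 0, 1}
joint = {(-1, 1): Fraction(1, 3), (0, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}

def mean(g):
    """Average of g(x, y) over the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

# cov(x, y) = <xy> - <x><y>
cov = mean(lambda x, y: x * y) - mean(lambda x, y: x) * mean(lambda x, y: y)

# Marginal probabilities of x = 0 and of y = 0, and the joint probability of both
p_x0 = sum(p for (x, _), p in joint.items() if x == 0)
p_y0 = sum(p for (_, y), p in joint.items() if y == 0)
p_joint = joint[(0, 0)]

print(cov, p_joint, p_x0 * p_y0)
```

Here `cov` is zero, yet `p_joint` differs from `p_x0 * p_y0`, so the joint distribution does not factorize.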
5 The Markov Propagator

The observed similarity between Markov processes and quantum mechanics should have prepared us for what is coming now.

A Markov propagator density function is the Markov state density function that yields the probability density at $t + dt$, where $dt$ is an infinitesimal increment, given that the system has been at $\mathbf{x}$ at time $t$, that is

$$P\left((\mathbf{x}+\mathbf{x}',t+dt) \mid (\mathbf{x},t)\right), \tag{24}$$

where $\mathbf{x}'$, unlike $dt$, is not infinitesimal; for example, the state may have jumped in the time $dt$ to somewhere quite far away from $\mathbf{x}$. We consider it a function of $\mathbf{x}'$, parametrized by the initial state $(\mathbf{x},t)$ and labelled by $dt$, and we employ the impressive-looking capital pi, $\Pi$, to denote it:

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = P\left((\mathbf{x}+\mathbf{x}',t+dt) \mid (\mathbf{x},t)\right). \tag{25}$$

We can think of it as a notational shortcut for (24). The notation here reflects that of quantum mechanics. We can read the $|dt|$ construct as a device that implements the infinitesimal time advance, applied to the initial state $(\mathbf{x},t)$. After the application of the device, we ask about the probability of the system drifting from $\mathbf{x}$ by $\mathbf{x}'$.

Being itself a probability density, $\Pi$ must satisfy

$$\begin{aligned}
\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) &\ge 0,\\
\int_{\Omega(\mathbf{x}')} \Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) d\mathbf{x}' &= 1.
\end{aligned} \tag{26}$$

For $dt = 0$ we must have

$$\Pi\left(\mathbf{x}' \,|\, dt = 0 \,|\, (\mathbf{x},t)\right) = \delta(\mathbf{x}'). \tag{27}$$

In this case, the time advance device $|dt|$ does not advance the time at all, so the system in question must remain at $\mathbf{x}$.

Being the Markov state density function, the propagator density function must also satisfy the Chapman-Kolmogorov equation, which in this case is usually written in the following form

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = \int_{\Omega(\mathbf{x}'')} \Pi\left(\mathbf{x}'-\mathbf{x}'' \,|\, (1-\alpha)\,dt \,|\, (\mathbf{x}+\mathbf{x}'',t+\alpha\,dt)\right) \Pi\left(\mathbf{x}'' \,|\, \alpha\,dt \,|\, (\mathbf{x},t)\right) d\mathbf{x}'', \tag{28}$$

where $\alpha \in\, ]0,1[$, which follows directly from the evaluation of

$$P\left((\mathbf{x}+\mathbf{x}',t+dt) \mid (\mathbf{x},t)\right) \tag{29}$$

through an intermediate point $\mathbf{x}+\mathbf{x}''$ at $t+\alpha\,dt$.

The first step in (28) advances the state from its origin at $(\mathbf{x},t)$ by the infinitesimal time machine of $|\alpha\,dt|$, and the state deflects by $\mathbf{x}''$. The second step then commences with the state at $\mathbf{x}+\mathbf{x}''$ and the time advanced to $t+\alpha\,dt$. We apply again the time machine that advances the state by the remainder of $dt$, that is, by $(1-\alpha)\,dt$, and the state ends up deflected by $\mathbf{x}'$ from the original $\mathbf{x}$. But the original $\mathbf{x}$ is not the starting point of this propagator. The starting point is $\mathbf{x}+\mathbf{x}''$, so its end point of $\mathbf{x}+\mathbf{x}'$ must be recomputed in reference to $\mathbf{x}+\mathbf{x}''$, which is

$$\mathbf{x}+\mathbf{x}'-\mathbf{x}-\mathbf{x}'' = \mathbf{x}'-\mathbf{x}''. \tag{30}$$

Since $\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right)$ is a probability density of $\mathbf{x}'$, the latter is its random variable. And it is this random variable, here denoted by the ordered pair that associates the probability density with it,

$$\left(\mathbf{x}',\ \Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right)\right), \tag{31}$$

that we call the Markov propagator.

(25) reminds us that we may think of it as $\mathbf{x}' = \mathbf{x}(t+dt) - \mathbf{x}(t)$, which makes it a sort of a differential. But it is a random variable differential that is sensitive to the changes in the probability density across the $dt$, not just $d\mathbf{x}(t)$. So we should really write this more accurately as:

$$\left(\mathbf{x}', P_{\mathbf{x}'}(\mathbf{x}')\right) = \left(\mathbf{x}, P_{\mathbf{x}}(\mathbf{x},t+dt)\right) - \left(\mathbf{x}, P_{\mathbf{x}}(\mathbf{x},t)\right). \tag{32}$$

The Random Variable Transformation theorem provides us with a formula for the probability density of variables that result from some functional operation on other random variables. The formula is

$$P_y(y_1,\ldots,y_m) = \int_{\Omega(x_1)} \cdots \int_{\Omega(x_n)} P_x(x_1,\ldots,x_n) \prod_{i=1}^{m} \delta\left(y_i - f_i(x_1,\ldots,x_n)\right) dx_1 \ldots dx_n, \tag{33}$$

where $f_i(x_1,\ldots,x_n)$ are functions that transform the $x_j$ into $y_i$ (we do not insist on $m = n$), $P_x$ is a combined multivariate probability density of $x_1,\ldots,x_n$, and $P_y$ is the probability density of the $y_1,\ldots,y_m$ random variables produced by the operations $f_i$. $\delta$ is the Dirac delta function.

The Chapman-Kolmogorov equation for the propagator density function, (28), can be rewritten to reflect the Random Variable Transformation formula as follows

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = \int_{\Omega(\mathbf{x}_1)} \int_{\Omega(\mathbf{x}_2)} \Pi\left(\mathbf{x}_2 \,|\, (1-\alpha)\,dt \,|\, (\mathbf{x}+\mathbf{x}_1,t+\alpha\,dt)\right) \Pi\left(\mathbf{x}_1 \,|\, \alpha\,dt \,|\, (\mathbf{x},t)\right) \delta\left(\mathbf{x}'-\mathbf{x}_1-\mathbf{x}_2\right) d\mathbf{x}_1\, d\mathbf{x}_2, \tag{34}$$

which demonstrates that the random variable operation that is being performed here is $\mathbf{x}_1+\mathbf{x}_2$. It is also in this sense that we should understand $\mathbf{x}' = \mathbf{x}(t+dt) - \mathbf{x}(t)$.

$\Pi$, representing in some way a differential, would correspond to something like a derivative if we were to divide it by $dt$:

$$\frac{\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right)}{dt}. \tag{35}$$

This is no longer a probability density, because unlike (26) it does not integrate to 1 over the domain of $\mathbf{x}'$. But it is still non-negative everywhere and we may associate moments with it. Assuming $x$ and $x'$ to be scalars, we define

$$\Pi_n(x,t) = \lim_{dt \to 0} \int_{\Omega(x')} (x')^n\, \frac{\Pi\left(x' \,|\, dt \,|\, (x,t)\right)}{dt}\, dx'. \tag{36}$$

$\Pi_n(x,t)$ is called the $n$-th propagator moment function of the Markov process described by (24). The above definition does not imply that

$$\Pi_n(x,t)\, dt = \int_{\Omega(x')} (x')^n\, \Pi\left(x' \,|\, dt \,|\, (x,t)\right) dx'. \tag{37}$$

What it implies is that

$$\Pi_n(x,t)\, dt = \int_{\Omega(x')} (x')^n\, \Pi\left(x' \,|\, dt \,|\, (x,t)\right) dx' + O\left((dt)^2\right). \tag{38}$$

Upon division of both sides by $dt$ and upon taking the limit $dt \to 0$, the small term of the second and higher orders in $dt$, $O\left((dt)^2\right)$, disappears.

We've been careful to refer to $\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right)/dt$ as "something like a derivative." This is because a well defined time derivative for a real, genuinely stochastic Markov process does not exist. A trajectory of Brownian motion, for example, is clearly not differentiable. This is just one such example. We can formalize this as follows. We begin with

$$\mathbf{x}(t+dt) - \mathbf{x}(t) \approx \langle \mathbf{x}' \rangle \pm \sigma(\mathbf{x}'). \tag{39}$$

Now we switch to 1-D and make use of the propagator moment functions. Clearly, from (38)

$$\langle x' \rangle = \Pi_1(x,t)\, dt + O\left((dt)^2\right) \tag{40}$$

and so

$$\sigma^2(x') = \langle x'^2 \rangle - \langle x' \rangle^2 = \Pi_2(x,t)\, dt + O\left((dt)^2\right) - \left(\Pi_1(x,t)\, dt + O\left((dt)^2\right)\right)^2. \tag{41}$$

Therefore

$$\sigma(x') = \sqrt{\Pi_2(x,t)\, dt + O\left((dt)^2\right) - \left(\Pi_1(x,t)\, dt + O\left((dt)^2\right)\right)^2} \approx \sqrt{\Pi_2(x,t)\, dt + O\left((dt)^2\right)}. \tag{42}$$

From this we get

$$\frac{x(t+dt) - x(t)}{dt} \approx \Pi_1(x,t) + O(dt) \pm \sqrt{\frac{\Pi_2(x,t)}{dt} + O(1)}. \tag{43}$$

We see now that the second term, the one that contains $\Pi_2$, explodes as $dt \to 0$, unless $\Pi_2$ is zero as well. But $\Pi_2$, which is related to the variance, is zero only if $x'$ is a sure variable, therefore not representing a genuinely stochastic Markov process.
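The blow-up of the difference quotient can be observed in a quick numerical experiment. The sketch below assumes, for illustration only, that the increment $x'$ is Gaussian with mean $\Pi_1\,dt$ and variance $\Pi_2\,dt$, with constant $\Pi_1$ and $\Pi_2$ (the names `drift` and `gamma` and their values are hypothetical). It estimates the spread of $x'/dt$ for shrinking $dt$ and prints it next to $\sqrt{\Pi_2/dt}$.

```python
import math
import random
import statistics

def quotient_spread(dt, drift=1.0, gamma=0.5, samples=20000, seed=0):
    """Sample standard deviation of x'/dt when x' ~ Normal(drift*dt, gamma*dt).

    The Gaussian form of the propagator is an assumption of this sketch;
    the text only constrains its first two moment functions.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(gamma * dt)
    quotients = [rng.gauss(drift * dt, sigma) / dt for _ in range(samples)]
    return statistics.pstdev(quotients)

for dt in (1e-2, 1e-4, 1e-6):
    # measured spread of the difference quotient next to sqrt(Pi_2 / dt)
    print(dt, quotient_spread(dt), math.sqrt(0.5 / dt))
```

Each hundredfold reduction of `dt` increases the spread of the quotient roughly tenfold, so the quotient has no limit as `dt` goes to zero.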

The compounded Chapman-Kolmogorov equation (18) can be rewritten in terms of propagator densities, assuming that the interval $[t_0,t]$ is subdivided into a large number $n$ of infinitesimal segments $dt$. Then, with $\mathbf{x}_n = \mathbf{x}$ and $t_n = t$,

$$P\left((\mathbf{x},t) \mid (\mathbf{x}_0,t_0)\right) = \int_{\Omega(\mathbf{x}_1)} \cdots \int_{\Omega(\mathbf{x}_{n-1})} \prod_{i=1}^{n} \Pi\left(\mathbf{x}_i - \mathbf{x}_{i-1} \,|\, dt \,|\, (\mathbf{x}_{i-1},t_{i-1})\right) d\mathbf{x}_1 \ldots d\mathbf{x}_{n-1}. \tag{44}$$

In principle, this equation lets us reconstruct the Markov process from the knowledge of the propagator density function. It can be thought of as a Markovian equivalent of the Schrödinger equation, where the Hamiltonian plays a similar role. Equations (26), (27) and (28), supplemented by a small number of additional requirements, lead to tractable expressions for the propagator density functions, again in similarity to known expressions for Hamiltonians. The procedure then makes the resulting Markov processes tractable analytically and numerically. It is most surprising how much can be inferred about them starting from simple assumptions.



6 The Kramers-Moyal Equations 

The Kramers-Moyal equations are partial differential equations for the Markov state density function expressed with the help of the propagator moment functions, defined by (36). They follow directly from the forward (15) and backward (17) Kolmogorov equations.

forward: Our starting point is the 1-dimensional forward Kolmogorov equation

$$P\left((x,t+\Delta t) \mid (x_0,t_0)\right) = \int_{\Omega(x')} P\left((x,t+\Delta t) \mid (x-x',t)\right) P\left((x-x',t) \mid (x_0,t_0)\right) dx'. \tag{45}$$

We introduce an auxiliary function

$$f(x) = P\left((x+x',t+\Delta t) \mid (x,t)\right) P\left((x,t) \mid (x_0,t_0)\right) \tag{46}$$

and observe that the expression under the integral in the forward Kolmogorov equation (45) is $f(x-x')$, which can be expanded in the Taylor series around $x$, if $f$ is analytic:

$$f(x-x') = f(x) + \sum_{n=1}^{\infty} \frac{(-x')^n}{n!} \frac{\partial^n f(x)}{\partial x^n}. \tag{47}$$

We substitute this into (45) with the following effect

$$\begin{aligned}
P\left((x,t+\Delta t) \mid (x_0,t_0)\right) &= \int_{\Omega(x')} P\left((x+x',t+\Delta t) \mid (x,t)\right) P\left((x,t) \mid (x_0,t_0)\right) dx'\\
&\quad + \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial x^n} \int_{\Omega(x')} (x')^n P\left((x+x',t+\Delta t) \mid (x,t)\right) P\left((x,t) \mid (x_0,t_0)\right) dx'.
\end{aligned} \tag{48}$$

Let's have a look at the first integral. $x'$ appears only in the first $P$ term. This, therefore, is a normalization integral which evaluates to 1 times the second $P$ term. The integrals in the sum similarly depend on $x'$, which appears only in the first $P$ in the integrated function. The second $P$ is therefore a coefficient that can be put in front of the integral. We subtract $P\left((x,t) \mid (x_0,t_0)\right)$ from both sides and divide both sides by $\Delta t$, which yields

$$\frac{P\left((x,t+\Delta t) \mid (x_0,t_0)\right) - P\left((x,t) \mid (x_0,t_0)\right)}{\Delta t} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial x^n} \left( P\left((x,t) \mid (x_0,t_0)\right) \frac{1}{\Delta t} \int_{\Omega(x')} (x')^n P\left((x+x',t+\Delta t) \mid (x,t)\right) dx' \right). \tag{49}$$

Now we take the limit $\Delta t \to 0$. In this limit the integral in the sum, upon its division by $\Delta t$, becomes

$$\int_{\Omega(x')} (x')^n\, \frac{\Pi\left(x' \,|\, dt \,|\, (x,t)\right)}{dt}\, dx' = \Pi_n(x,t), \tag{50}$$

the definition we have already introduced in (36). This, finally, leads to the forward Kramers-Moyal equation

$$\frac{\partial}{\partial t} P\left((x,t) \mid (x_0,t_0)\right) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial x^n} \left( \Pi_n(x,t)\, P\left((x,t) \mid (x_0,t_0)\right) \right). \tag{51}$$

backward: Our starting point is the 1-dimensional backward Kolmogorov equation

$$P\left((x,t) \mid (x_0,t_0)\right) = \int_{\Omega(x')} P\left((x,t) \mid (x_0+x',t_0+\Delta t_0)\right) P\left((x_0+x',t_0+\Delta t_0) \mid (x_0,t_0)\right) dx'. \tag{52}$$

We introduce an auxiliary function

$$f(x_0) = P\left((x,t) \mid (x_0,t_0+\Delta t_0)\right) \tag{53}$$

and observe that the first $P$ under the integral in the backward Kolmogorov equation (52) is $f(x_0+x')$, which can be expanded in the Taylor series around $x_0$, if $f$ is analytic:

$$f(x_0+x') = f(x_0) + \sum_{n=1}^{\infty} \frac{(x')^n}{n!} \frac{\partial^n f(x_0)}{\partial x_0^n}. \tag{54}$$

We substitute this into (52) with the following effect

$$\begin{aligned}
P\left((x,t) \mid (x_0,t_0)\right) &= P\left((x,t) \mid (x_0,t_0+\Delta t_0)\right) \int_{\Omega(x')} P\left((x_0+x',t_0+\Delta t_0) \mid (x_0,t_0)\right) dx'\\
&\quad + \sum_{n=1}^{\infty} \frac{1}{n!} \left( \frac{\partial^n}{\partial x_0^n} P\left((x,t) \mid (x_0,t_0+\Delta t_0)\right) \right) \int_{\Omega(x')} (x')^n P\left((x_0+x',t_0+\Delta t_0) \mid (x_0,t_0)\right) dx'.
\end{aligned} \tag{55}$$

We observe that the integral in the first addend is a normalization integral for $P$ and therefore equal to 1. This leaves $P\left((x,t) \mid (x_0,t_0+\Delta t_0)\right)$ alone and we transfer it to the left side of the equation and divide both sides by $\Delta t_0$. In the limit of $\Delta t_0 \to 0$, this yields minus the derivative of $P$ with respect to $t_0$ on the left side. The right side of the equation is left with the sum only and the integral turns into the moment integral of the Markov propagator density function, which, upon the division by $\Delta t_0$, we call $\Pi_n(x_0,t_0)$, as per (36). Having used $\Delta t_0$ in this way, we cannot use it again in the differentiated $P$. Instead, in the same limit,

$$P\left((x,t) \mid (x_0,t_0+\Delta t_0)\right) \xrightarrow{\ \Delta t_0 \to 0\ } P\left((x,t) \mid (x_0,t_0)\right). \tag{56}$$

In summary we end up with the backward Kramers-Moyal equation:

$$-\frac{\partial}{\partial t_0} P\left((x,t) \mid (x_0,t_0)\right) = \sum_{n=1}^{\infty} \frac{1}{n!}\, \Pi_n(x_0,t_0)\, \frac{\partial^n}{\partial x_0^n} P\left((x,t) \mid (x_0,t_0)\right). \tag{57}$$

The thing to observe is that the $\Pi_n$ coefficients are differentiated together with $P$ in the forward Kramers-Moyal equation, but not in the backward one. There is also the $(-1)^n$ factor in the forward equation, but not in the backward one, and the sign in front of the time derivative is negative in the backward equation.

Finally, looking at both Kramers-Moyal equations, it is easier to understand why one is called forward and the other backward. This is not so clear when looking at the original Kolmogorov equations. In the forward Kramers-Moyal equation, the $(x_0,t_0)$ pair is a fixed parameter and the differentiation is over $t$ and $x$ and proceeds forward in time. The initial condition for the equation is

$$P\left((x,t=t_0) \mid (x_0,t_0)\right) = \delta(x-x_0). \tag{58}$$

In the backward Kramers-Moyal equation, the $(x,t)$ pair is a fixed parameter and the differentiation is over $t_0$ and $x_0$ and proceeds backward in time. The initial condition for the equation is

$$P\left((x,t) \mid (x_0,t_0=t)\right) = \delta(x-x_0). \tag{59}$$
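Both equations can be checked numerically in a simple special case. The sketch below assumes constant $\Pi_1 = V$ and $\Pi_2 = \Gamma$ and, as a further assumption of the sketch, $\Pi_n = 0$ for $n \ge 3$, so that only the first two terms of each sum survive. Under these assumptions a Gaussian with mean $x_0 + V(t-t_0)$ and variance $\Gamma(t-t_0)$ is a candidate solution, and central finite differences are used to evaluate the residuals of the forward and backward equations. The names `V`, `GAMMA`, `P`, `d1` and `d2` are hypothetical.

```python
import math

V, GAMMA = 0.8, 0.5  # assumed constant Pi_1 and Pi_2; Pi_n = 0 for n >= 3 is assumed

def P(x, t, x0, t0):
    """Gaussian candidate: mean x0 + V*(t - t0), variance GAMMA*(t - t0)."""
    var = GAMMA * (t - t0)
    mean = x0 + V * (t - t0)
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def d1(g, u, h=1e-3):
    """Central-difference first derivative of g at u."""
    return (g(u + h) - g(u - h)) / (2.0 * h)

def d2(g, u, h=1e-3):
    """Central-difference second derivative of g at u."""
    return (g(u + h) - 2.0 * g(u) + g(u - h)) / (h * h)

def forward_residual(x, t, x0, t0):
    """dP/dt - [ -d(V P)/dx + (1/2) d^2(GAMMA P)/dx^2 ], which should vanish."""
    dP_dt = d1(lambda s: P(x, s, x0, t0), t)
    dP_dx = d1(lambda y: P(y, t, x0, t0), x)
    d2P_dx2 = d2(lambda y: P(y, t, x0, t0), x)
    return dP_dt - (-V * dP_dx + 0.5 * GAMMA * d2P_dx2)

def backward_residual(x, t, x0, t0):
    """-dP/dt0 - [ V dP/dx0 + (GAMMA/2) d^2P/dx0^2 ], which should vanish."""
    dP_dt0 = d1(lambda s: P(x, t, x0, s), t0)
    dP_dx0 = d1(lambda y: P(x, t, y, t0), x0)
    d2P_dx02 = d2(lambda y: P(x, t, y, t0), x0)
    return -dP_dt0 - (V * dP_dx0 + 0.5 * GAMMA * d2P_dx02)

print(forward_residual(1.0, 1.5, 0.2, 0.5), backward_residual(1.0, 1.5, 0.2, 0.5))
```

Both residuals are of the size of the finite-difference error, which illustrates the sign conventions: the $(-1)^n$ factor appears only in the forward equation, and the time derivative carries a minus sign only in the backward one.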



7 Evolution of the Moments 

We're going to do it all in 1-D. Our starting point is

$$x(t+dt) = x(t) + x'. \tag{60}$$

Hence

$$x^n(t+dt) = \left(x(t) + x'\right)^n = x^n(t) + \sum_{k=1}^{n} \binom{n}{k} x^{n-k}(t)\, x'^k. \tag{61}$$

Now we average both sides of the equation

$$\langle x^n(t+dt) \rangle = \langle x^n(t) \rangle + \sum_{k=1}^{n} \binom{n}{k} \langle x^{n-k}(t)\, x'^k \rangle. \tag{62}$$

The probability density of $x(t)$ is $P\left((x,t) \mid (x_0,t_0)\right)$ and the probability density of $x'$ is $\Pi\left(x' \,|\, dt \,|\, (x,t)\right)$. Therefore the combined probability density needed to average $x^{n-k}(t)\, x'^k$ is

$$\Pi\left(x' \,|\, dt \,|\, (x,t)\right) P\left((x,t) \mid (x_0,t_0)\right) \tag{63}$$

and the integration to produce $\langle x^j(t)\, x'^k \rangle$ for some $j$ and $k$ must run over $x$ and $x'$:

$$\begin{aligned}
\langle x^j(t)\, x'^k \rangle &= \int_{\Omega(x)} \int_{\Omega(x')} x^j x'^k\, \Pi\left(x' \,|\, dt \,|\, (x,t)\right) P\left((x,t) \mid (x_0,t_0)\right) dx'\, dx\\
&= \int_{\Omega(x)} x^j\, \Pi_k(x,t)\, P\left((x,t) \mid (x_0,t_0)\right) dx\; dt + O\left((dt)^2\right)\\
&= \langle x^j(t)\, \Pi_k(x(t),t) \rangle\, dt + O\left((dt)^2\right).
\end{aligned} \tag{64}$$

Let us go back to (62). We subtract $\langle x^n(t) \rangle$ from both sides, substitute (64) in place of $\langle x^{n-k}(t)\, x'^k \rangle$ and divide both sides by $dt$, which yields

$$\frac{d}{dt} \langle x^n(t) \rangle = \sum_{k=1}^{n} \binom{n}{k} \langle x^{n-k}(t)\, \Pi_k(x(t),t) \rangle. \tag{65}$$

And this is our equation for the evolution of the moments of the Markov process $x(t)$ that does not explicitly use the $P$s or the $\Pi$s. But, of course, the propagator density is hidden inside the propagator moment functions $\Pi_n$. The initial condition for the equation is

$$\langle x^n(t_0) \rangle = x_0^n. \tag{66}$$

7.1 Mean, Variance and Covariance 

The equation for the evolution of the mean of the Markov process is trivial,

$$\frac{d}{dt} \langle x(t) \rangle = \langle \Pi_1(x(t),t) \rangle, \tag{67}$$

and follows directly from (65). The initial condition for this equation is

$$\langle x(t=t_0) \rangle = x_0. \tag{68}$$

For $\langle x^2 \rangle$, (65) yields

$$\frac{d}{dt} \langle x^2(t) \rangle = 2 \langle x(t)\, \Pi_1(x(t),t) \rangle + \langle \Pi_2(x(t),t) \rangle. \tag{69}$$

From this and from (67) we obtain

$$\frac{d}{dt} \mathrm{var}(x(t)) = \frac{d}{dt} \left( \langle x^2(t) \rangle - \langle x(t) \rangle^2 \right) = 2 \langle x(t)\, \Pi_1(x(t),t) \rangle + \langle \Pi_2(x(t),t) \rangle - 2 \langle x(t) \rangle \langle \Pi_1(x(t),t) \rangle. \tag{70}$$

The initial condition for this equation is

$$\mathrm{var}(x(t=t_0)) = 0. \tag{71}$$

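Covariance is a function of two variables, for example, $\mathrm{cov}(x(t_1),x(t_2))$. Here we are going to evaluate its derivative with respect to $t_2$:

$$\begin{aligned}
\frac{\partial}{\partial t_2} \mathrm{cov}(x(t_1),x(t_2)) &= \frac{\partial}{\partial t_2} \left( \langle x(t_1)\, x(t_2) \rangle - \langle x(t_1) \rangle \langle x(t_2) \rangle \right)\\
&= \frac{\partial}{\partial t_2} \langle x(t_1)\, x(t_2) \rangle - \langle x(t_1) \rangle \frac{d}{dt_2} \langle x(t_2) \rangle\\
&= \frac{\partial}{\partial t_2} \langle x(t_1)\, x(t_2) \rangle - \langle x(t_1) \rangle \langle \Pi_1(x(t_2),t_2) \rangle.
\end{aligned} \tag{72}$$

The first component of the sum on the right side of the equation still requires some work. We proceed as follows

$$x(t_1)\, x(t_2+dt_2) = x(t_1) \left( x(t_2) + x'(t_2) \right) = x(t_1)\, x(t_2) + x(t_1)\, x'(t_2), \tag{73}$$

the average of which is

$$\langle x(t_1)\, x(t_2) \rangle + \langle x(t_1)\, x'(t_2) \rangle, \tag{74}$$

where

$$\langle x(t_1)\, x(t_2) \rangle = \int_{\Omega(x_1)} \int_{\Omega(x_2)} x_1 x_2\, P\left((x_2,t_2) \mid (x_1,t_1)\right) P\left((x_1,t_1) \mid (x_0,t_0)\right) dx_1\, dx_2. \tag{75}$$

Therefore

$$\langle x(t_1)\, x(t_2+dt_2) \rangle - \langle x(t_1)\, x(t_2) \rangle = \langle x(t_1)\, x'(t_2) \rangle. \tag{76}$$

The right side of this equation is somewhat tricky, because here we have to average over $x'(t_2)$ as well. We do this as follows

$$\begin{aligned}
\langle x(t_1)\, x'(t_2) \rangle &= \int_{\Omega(x_1)} \int_{\Omega(x_2)} \int_{\Omega(x')} x_1 x'\, \Pi\left(x' \,|\, dt_2 \,|\, (x_2,t_2)\right) P\left((x_2,t_2) \mid (x_1,t_1)\right) P\left((x_1,t_1) \mid (x_0,t_0)\right) dx'\, dx_2\, dx_1\\
&= \int_{\Omega(x_1)} \int_{\Omega(x_2)} x_1 \left( \Pi_1(x_2,t_2)\, dt_2 + O\left((dt_2)^2\right) \right) P\left((x_2,t_2) \mid (x_1,t_1)\right) P\left((x_1,t_1) \mid (x_0,t_0)\right) dx_2\, dx_1\\
&= \langle x(t_1)\, \Pi_1(x(t_2),t_2) \rangle\, dt_2 + O\left((dt_2)^2\right).
\end{aligned} \tag{77}$$

Substituting this result into (76) and dividing both sides by $dt_2$ yields

$$\frac{\partial}{\partial t_2} \langle x(t_1)\, x(t_2) \rangle = \langle x(t_1)\, \Pi_1(x(t_2),t_2) \rangle. \tag{78}$$

Now we plug this result into (72), which yields

$$\frac{\partial}{\partial t_2} \mathrm{cov}(x(t_1),x(t_2)) = \langle x(t_1)\, \Pi_1(x(t_2),t_2) \rangle - \langle x(t_1) \rangle \langle \Pi_1(x(t_2),t_2) \rangle. \tag{79}$$

The initial condition for this equation is

$$\mathrm{cov}(x(t_1),x(t_2=t_1)) = \mathrm{var}(x(t_1)). \tag{80}$$

It is useful to note that when $\Pi_1(x,t)$ does not depend on $x$, then it falls out of the averaging brackets $\langle\,\rangle$ on the right side of (79), which makes the right side zero. Thus $\mathrm{cov}(x(t_1),x(t_2))$ ends up independent of $t_2$ and so it must remain set to the initial condition, that is, $\mathrm{var}(x(t_1))$.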

Looking at (67), (70) and (79) and the related initial conditions (68), (71) and (80) we find that these equations seem not only eminently tractable, but even relatively simple, depending, that is, on the $\Pi_n(x,t)$ functions. Thus, by persistent chipping at the problem, we have progressed from the initial view of stochastic processes that was intimidating, to say the least, to quite tractable equations that describe the evolution of the mean, variance and covariance of the Markov process $x(t)$.

To illustrate this point, we are going to look at two simple examples that happen to be applicable to 
some Markov processes of interest. The examples also illustrate how we would use the propagator moment 
functions of the Markov process to specify it. 

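$\Pi_1(x,t) = v$, $\Pi_2(x,t) = \gamma$: where $v$ and $\gamma \ge 0$ are constants. In this case $\langle \Pi_1 \rangle = v$ and $\langle \Pi_2 \rangle = \gamma$ and the equations that describe the evolution of the mean, the variance and the covariance plus their initial conditions are

$$\begin{aligned}
\frac{d}{dt} \langle x(t) \rangle &= v,\\
\langle x(t=t_0) \rangle &= x_0,\\
\frac{d}{dt} \mathrm{var}(x(t)) &= \gamma,\\
\mathrm{var}(x(t=t_0)) &= 0,\\
\frac{\partial}{\partial t_2} \mathrm{cov}(x(t_1),x(t_2)) &= 0,\\
\mathrm{cov}(x(t_1),x(t_2=t_1)) &= \mathrm{var}(x(t_1)).
\end{aligned} \tag{81}$$

The derivative of the covariance is zero, because $\Pi_1$ is a constant, which means that

$$\mathrm{cov}(x(t_1),x(t_2)) = \mathrm{var}(x(t_1)). \tag{82}$$

The solution to this problem is therefore

$$\begin{aligned}
\langle x(t) \rangle &= x_0 + v\,(t-t_0),\\
\mathrm{var}(x(t)) &= \gamma\,(t-t_0),\\
\mathrm{cov}(x(t_1),x(t_2)) &= \gamma\,(t_1-t_0).
\end{aligned} \tag{83}$$

For $\gamma = 0$ the variance and the covariance remain zero and the process becomes deterministic.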
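$\Pi_1(x,t) = -\lambda x$, $\Pi_2(x,t) = \gamma$: where $\lambda > 0$ and $\gamma \ge 0$ are constants. This example is a little more complicated. The equations that govern it are as follows

$$\begin{aligned}
\frac{d}{dt} \langle x(t) \rangle &= -\lambda \langle x(t) \rangle,\\
\langle x(t=t_0) \rangle &= x_0,\\
\frac{d}{dt} \mathrm{var}(x(t)) &= -2\lambda\, \mathrm{var}(x(t)) + \gamma,\\
\mathrm{var}(x(t=t_0)) &= 0,\\
\frac{\partial}{\partial t_2} \mathrm{cov}(x(t_1),x(t_2)) &= -\lambda\, \mathrm{cov}(x(t_1),x(t_2)),\\
\mathrm{cov}(x(t_1),x(t_2=t_1)) &= \mathrm{var}(x(t_1)).
\end{aligned} \tag{84}$$

Before we go any further, we're going to explain how these equations come about. The first one is obvious. The equation for the variance is obtained by substituting our specific expressions for $\Pi_1$ and $\Pi_2$ in (70), which yields

$$\begin{aligned}
\frac{d}{dt} \mathrm{var}(x(t)) &= 2 \langle x(t) \left(-\lambda x(t)\right) \rangle + \gamma - 2 \langle x(t) \rangle \langle -\lambda x(t) \rangle\\
&= -2\lambda \left( \langle x^2(t) \rangle - \langle x(t) \rangle^2 \right) + \gamma\\
&= -2\lambda\, \mathrm{var}(x(t)) + \gamma.
\end{aligned} \tag{85}$$

The equation for the covariance is obtained by substituting $\Pi_1$ as defined above in (79), which yields

$$\begin{aligned}
\frac{\partial}{\partial t_2} \mathrm{cov}(x(t_1),x(t_2)) &= \langle x(t_1) \left(-\lambda x(t_2)\right) \rangle - \langle x(t_1) \rangle \langle -\lambda x(t_2) \rangle\\
&= -\lambda \left( \langle x(t_1)\, x(t_2) \rangle - \langle x(t_1) \rangle \langle x(t_2) \rangle \right)\\
&= -\lambda\, \mathrm{cov}(x(t_1),x(t_2)).
\end{aligned} \tag{86}$$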

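The solution of the first equation in (84) is of the form $e^{-\lambda t}$. The initial condition forces the following choice of constants

$$\langle x(t) \rangle = x_0\, e^{-\lambda (t-t_0)}. \tag{87}$$

The variance equation would be like the mean equation were it not for the non-homogeneous term $\gamma$. We deal with this by postulating a solution of the form

$$\mathrm{var}(x(t)) = A\, e^{-2\lambda (t-t_0)} f(t), \tag{88}$$

where $A$ is a constant and

$$\frac{d}{dt} \left( A\, e^{-2\lambda (t-t_0)} f(t) \right) = A\, e^{-2\lambda (t-t_0)} \left( \frac{d}{dt} f(t) - 2\lambda f(t) \right). \tag{89}$$

Substituting this solution into the equation for the variance yields

$$A\, e^{-2\lambda (t-t_0)} \left( \frac{d}{dt} f(t) - 2\lambda f(t) \right) = -2\lambda A\, e^{-2\lambda (t-t_0)} f(t) + \gamma. \tag{90}$$

We add $2\lambda A\, e^{-2\lambda (t-t_0)} f(t)$ to both sides of the equation, which kills this term, and we are left with

$$\frac{d}{dt} f(t) = \frac{\gamma}{A}\, e^{2\lambda (t-t_0)}, \tag{91}$$

which solves to

$$f(t) = \frac{\gamma}{2\lambda A}\, e^{2\lambda (t-t_0)} + B, \tag{92}$$

where $B$ is another constant. We substitute this into (88) and obtain

$$\mathrm{var}(x(t)) = A\, e^{-2\lambda (t-t_0)} \left( \frac{\gamma}{2\lambda A}\, e^{2\lambda (t-t_0)} + B \right) = \frac{\gamma}{2\lambda} + AB\, e^{-2\lambda (t-t_0)}, \tag{93}$$

leaving us with just one constant $AB$, as should be expected. For $t = t_0$ the exponential function is 1 and we obtain

$$\mathrm{var}(x(t=t_0)) = \frac{\gamma}{2\lambda} + AB = 0, \tag{94}$$

which implies that $AB = -\gamma/(2\lambda)$. Therefore

$$\mathrm{var}(x(t)) = \frac{\gamma}{2\lambda} \left( 1 - e^{-2\lambda (t-t_0)} \right). \tag{95}$$

The covariance equation is just like the mean equation. But here the initial condition at $t_2 = t_1$ sets the covariance to the variance of $x(t_1)$. Consequently the solution is

$$\mathrm{cov}(x(t_1),x(t_2)) = \mathrm{var}(x(t_1))\, e^{-\lambda (t_2-t_1)}. \tag{96}$$

But we have the expression for the variance in the form of (95). Plugging it into the above solution yields

$$\mathrm{cov}(x(t_1),x(t_2)) = \frac{\gamma}{2\lambda} \left( 1 - e^{-2\lambda (t_1-t_0)} \right) e^{-\lambda (t_2-t_1)}. \tag{97}$$

In summary

$$\begin{aligned}
\langle x(t) \rangle &= x_0\, e^{-\lambda (t-t_0)},\\
\mathrm{var}(x(t)) &= \frac{\gamma}{2\lambda} \left( 1 - e^{-2\lambda (t-t_0)} \right),\\
\mathrm{cov}(x(t_1),x(t_2)) &= \frac{\gamma}{2\lambda} \left( 1 - e^{-2\lambda (t_1-t_0)} \right) e^{-\lambda (t_2-t_1)}.
\end{aligned} \tag{98}$$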
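The closed-form solutions of the second example can be cross-checked by integrating its evolution equations numerically. The sketch below uses a classical fourth-order Runge-Kutta integrator, chosen here only for convenience, and hypothetical parameter values `LAM`, `GAMMA`, `X0` and `T0`. It integrates the mean, variance and covariance equations with their initial conditions, so that the results can be compared with the closed-form expressions.

```python
import math

LAM, GAMMA, X0, T0 = 1.5, 0.8, 2.0, 0.0  # hypothetical parameter values for this sketch

def rk4(f, y, t_start, t_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration of dy/dt = f(t, y), scalar y."""
    h = (t_end - t_start) / steps
    t = t_start
    for _ in range(steps):
        k1 = f(t, y)
        k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
        k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
        k4 = f(t + h, y + h * k3)
        y += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += h
    return y

def mean_num(t):
    # d<x>/dt = -lambda <x>, with <x(t0)> = x0
    return rk4(lambda s, m: -LAM * m, X0, T0, t)

def var_num(t):
    # d var/dt = -2 lambda var + gamma, with var(t0) = 0
    return rk4(lambda s, v: -2.0 * LAM * v + GAMMA, 0.0, T0, t)

def cov_num(t1, t2):
    # integrate in t2, starting from cov = var(x(t1)) at t2 = t1
    return rk4(lambda s, c: -LAM * c, var_num(t1), t1, t2)

def mean_exact(t):
    return X0 * math.exp(-LAM * (t - T0))

def var_exact(t):
    return GAMMA / (2.0 * LAM) * (1.0 - math.exp(-2.0 * LAM * (t - T0)))

def cov_exact(t1, t2):
    return var_exact(t1) * math.exp(-LAM * (t2 - t1))

print(mean_num(1.2) - mean_exact(1.2),
      var_num(1.2) - var_exact(1.2),
      cov_num(0.7, 1.9) - cov_exact(0.7, 1.9))
```

The printed differences are at the level of the integration error. For large `t` the variance approaches `GAMMA / (2 * LAM)`, the stationary value visible in the closed-form expression.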



8 Homogeneous Markov Processes 

Markov processes are said to be homogeneous if the corresponding Markov propagator density function does not depend on

time: that is

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = \Pi\left(\mathbf{x}' \,|\, dt \,|\, \mathbf{x}\right), \tag{99}$$

in which case they are called temporally homogeneous;
space: that is

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = \Pi\left(\mathbf{x}' \,|\, dt \,|\, t\right), \tag{100}$$

in which case they are called spatially homogeneous;
time and space: that is

$$\Pi\left(\mathbf{x}' \,|\, dt \,|\, (\mathbf{x},t)\right) = \Pi\left(\mathbf{x}' \,|\, dt \,|\right), \tag{101}$$

in which case they are called completely homogeneous.

Brownian motion is an example of a completely homogeneous process. A great deal can be said about such processes, because the equations that describe them are simpler and certain of their properties can be seen right away.

One such important property of temporally homogeneous Markov processes, and by extension also of completely homogeneous ones, is that the probability density $P\left((x,t) \mid (x_0,t_0)\right)$ depends on time through $t-t_0$ only, in other words

$$P\left((x,t) \mid (x_0,t_0)\right) = P\left((x,t-t_0) \mid (x_0,0)\right). \tag{102}$$

From this it follows immediately that

$$\frac{\partial}{\partial t} P\left((x,t) \mid (x_0,t_0)\right) = -\frac{\partial}{\partial t_0} P\left((x,t) \mid (x_0,t_0)\right). \tag{103}$$

Because the propagator density does not depend on time, the propagator moment functions $\Pi_n$ do not depend on time either

$$\Pi_n(x) = \lim_{dt \to 0} \frac{1}{dt} \int_{\Omega(x')} x'^n\, \Pi\left(x' \,|\, dt \,|\, x\right) dx'. \tag{104}$$

http://cnx.Org/content/m44014/l.20/ 



Connexions module: m44014 






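Consequently, the Kramers-Moyal equations simplify too, with the left side of the equations fully interchangeable on account of (103):

$$\begin{aligned}
\frac{\partial}{\partial t} P\left((x,t) \mid (x_0,t_0)\right) &= \sum_{n=1}^{\infty} \frac{(-1)^n}{n!} \frac{\partial^n}{\partial x^n} \left( \Pi_n(x)\, P\left((x,t) \mid (x_0,t_0)\right) \right),\\
\frac{\partial}{\partial t} P\left((x,t) \mid (x_0,t_0)\right) &= \sum_{n=1}^{\infty} \frac{1}{n!}\, \Pi_n(x_0)\, \frac{\partial^n}{\partial x_0^n} P\left((x,t) \mid (x_0,t_0)\right).
\end{aligned} \tag{105}$$

If the process is completely homogeneous, the above equations apply, but we can simplify even more. Let us divide $[t_0,t]$ into $n$ infinitesimal intervals, each of length $dt$; then we can write the compounded Chapman-Kolmogorov equation as follows

$$P\left((x,t) \mid (x_0,t_0)\right) = \int_{\Omega(x'_1)} \cdots \int_{\Omega(x'_{n-1})} \prod_{j=1}^{n} \Pi\left(x'_j \,|\, dt \,|\right) dx'_1 \ldots dx'_{n-1}, \tag{106}$$

where

$$x'_n = x - x_0 - \sum_{j=1}^{n-1} x'_j, \qquad x - x_0 = \sum_{j=1}^{n} x'_j, \qquad dt = \frac{t-t_0}{n}. \tag{107}$$

The integrals run over $x'_1, \ldots, x'_{n-1}$, but not over $x'_n$. We can add the latter by making use of the above expression for $x - x_0$ and inserting the corresponding Dirac delta in (106); then we make use of the fact that each $\Pi\left(x'_j \,|\, dt \,|\right)$ depends on its own $x'_j$ only and obtain

$$P\left((x,t) \mid (x_0,t_0)\right) = \int_{\Omega(x'_1)} \cdots \int_{\Omega(x'_n)} \prod_{j=1}^{n} \Pi\left(x'_j \,|\, dt \,|\right) \delta\left(x - x_0 - \sum_{j=1}^{n} x'_j\right) dx'_1 \ldots dx'_n. \tag{108}$$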
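The next step in gaining further insights is to express the Dirac delta in the form of the Fourier transform:

$$\delta(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ikx}\, dk, \tag{109}$$

which in this case becomes

$$\delta\left(x - x_0 - \sum_{j=1}^{n} x'_j\right) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ik(x-x_0)} \prod_{j=1}^{n} e^{-ikx'_j}\, dk. \tag{110}$$

We substitute this into (108), also assume that for each $j$, $\Omega(x'_j) = \,]-\infty,\infty[$, which yields

$$\begin{aligned}
P\left((x,t) \mid (x_0,t_0)\right) &= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ik(x-x_0)} \left( \int_{-\infty}^{\infty} \Pi\left(x' \,|\, dt \,|\right) e^{-ikx'}\, dx' \right)^n dk\\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ik(x-x_0)} \left( \hat{\Pi}\left(k \,|\, dt \,|\right) \right)^n dk,
\end{aligned} \tag{111}$$

where

$$\hat{\Pi}\left(k \,|\, dt \,|\right) = \int_{-\infty}^{\infty} \Pi\left(x' \,|\, dt \,|\right) e^{-ikx'}\, dx' \tag{112}$$

is the Fourier transform of $\Pi$.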
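The Fourier representation can be tried out on a propagator whose transform is known in closed form. The sketch below assumes, as an illustration only, a Gaussian propagator with mean $v\,dt$ and variance $\gamma\,dt$, for which $\hat{\Pi}(k|dt|) = \exp(-ikv\,dt - \gamma\,dt\,k^2/2)$. The function names, the parameter values and the truncation of the $k$ integral at `k_max` are choices made for the sketch. The $n$-th power of the transform is integrated over $k$ by the trapezoidal rule.

```python
import cmath
import math

def propagator_hat(k, dt, v=0.8, gamma=0.5):
    """Fourier transform of an assumed Gaussian propagator Normal(v*dt, gamma*dt)."""
    return cmath.exp(-1j * k * v * dt - 0.5 * gamma * dt * k * k)

def density_from_fourier(x, t, x0, t0, n=200, k_max=40.0, k_points=8000):
    """Trapezoidal evaluation of (1/2pi) * integral of exp(ik(x-x0)) * Pi_hat(k)**n dk."""
    dt = (t - t0) / n
    h = 2.0 * k_max / k_points
    total = 0.0 + 0.0j
    for i in range(k_points + 1):
        k = -k_max + i * h
        weight = 0.5 if i in (0, k_points) else 1.0
        total += weight * cmath.exp(1j * k * (x - x0)) * propagator_hat(k, dt) ** n
    return (total * h / (2.0 * math.pi)).real

def gaussian(x, mean, var):
    """Normal density, used as the reference result."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

numeric = density_from_fourier(1.3, 1.0, 0.2, 0.0)
reference = gaussian(1.3, 0.2 + 0.8 * 1.0, 0.5 * 1.0)
print(numeric, reference)
```

For this propagator the result is again Gaussian, with mean $x_0 + v\,(t-t_0)$ and variance $\gamma\,(t-t_0)$, and it does not depend on the number of subdivisions `n`.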

Looking at (111) we see that not only does $P\left((x,t) \mid (x_0,t_0)\right)$ depend on $t-t_0$, as per (107); it depends on $x$ through $x-x_0$ too. Hence

$$P\left((x,t) \mid (x_0,t_0)\right) = P\left((x-x_0,t-t_0) \mid (0,0)\right), \tag{113}$$

wherefrom

$$\frac{\partial P}{\partial t} = -\frac{\partial P}{\partial t_0}, \qquad \frac{\partial P}{\partial x} = -\frac{\partial P}{\partial x_0}. \tag{114}$$

For the completely homogeneous Markov processes the propagator moment functions $\Pi_n(x,t)$ no longer depend on $x$ or $t$. They are therefore constants, $\Pi_n$. In combination with (114), this reduces the two Kramers-Moyal equations to just one

$$\frac{\partial}{\partial t} P\left((x,t) \mid (x_0,t_0)\right) = \sum_{n=1}^{\infty} \frac{(-1)^n}{n!}\, \Pi_n\, \frac{\partial^n}{\partial x^n} P\left((x,t) \mid (x_0,t_0)\right). \tag{115}$$

The equation for the evolution of the moments, (65), similarly simplifies to

$$\frac{d}{dt} \langle x^n(t) \rangle = \sum_{k=1}^{n} \binom{n}{k}\, \Pi_k\, \langle x^{n-k}(t) \rangle. \tag{116}$$
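With constant $\Pi_k$ the moment equations close on themselves: the rate of change of $\langle x^n \rangle$ involves only lower moments. The sketch below integrates this hierarchy with a plain forward-Euler scheme. The function names, the dictionary `pi` of constants and the treatment of missing $\Pi_k$ as zero are assumptions of the sketch. It can be compared against the first example, for which $\langle x \rangle = x_0 + v\,(t-t_0)$ and $\langle x^2 \rangle = \gamma\,(t-t_0) + \langle x \rangle^2$.

```python
from math import comb

def moment_rates(moments, pi):
    """d<x^n>/dt = sum_{k=1..n} C(n,k) * Pi_k * <x^(n-k)>, with <x^0> = 1.

    moments[n] holds <x^n>; pi[k] holds the constant Pi_k. Any k missing
    from pi is treated as zero, which is an assumption of this sketch.
    """
    rates = [0.0]  # <x^0> = 1 never changes
    for n in range(1, len(moments)):
        rates.append(sum(comb(n, k) * pi.get(k, 0.0) * moments[n - k]
                         for k in range(1, n + 1)))
    return rates

def evolve(x0, pi, t_span, order=3, steps=5000):
    """Forward-Euler integration of the moment hierarchy from <x^n(t0)> = x0**n."""
    h = t_span / steps
    m = [x0 ** n for n in range(order + 1)]
    for _ in range(steps):
        r = moment_rates(m, pi)
        m = [mi + h * ri for mi, ri in zip(m, r)]
    return m

# Constants of the first example: Pi_1 = v = 0.8, Pi_2 = gamma = 0.5
m = evolve(0.2, {1: 0.8, 2: 0.5}, t_span=1.0)
print(m[1], m[2])
```

The first moment comes out at $x_0 + v\,(t-t_0)$ and the second is close to $\gamma\,(t-t_0) + (x_0 + v\,(t-t_0))^2$, up to the first-order error of the Euler scheme.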